Summarize the data set data_eu.
The data set data_eu has 1638 observations and 12 variables.
## [1] 1638 12
The variables in data_eu are:
## [1] "Country" "Year" "Gender"
## [4] "BMI_Index" "Bloodpressure" "Cholesterol"
## [7] "Sugar" "Food" "Income"
## [10] "Life_expectancy" "Region" "Period"
## 'data.frame': 1638 obs. of 12 variables:
## $ Country : Factor w/ 40 levels "Albania","Armenia",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ Year : Factor w/ 25 levels "1980","1981",..: 1 1 2 2 3 3 4 4 5 5 ...
## $ Gender : Factor w/ 2 levels "female","male": 2 1 2 1 1 2 2 1 2 1 ...
## $ BMI_Index : num 25.2 25.2 25.2 25.2 25.2 ...
## $ Bloodpressure : num 133 132 133 132 132 ...
## $ Cholesterol : num 5.01 5.04 5 5.04 5.03 ...
## $ Sugar : num 46.6 46.6 46.6 46.6 46.6 ...
## $ Food : num 2660 2660 2748 2748 2692 ...
## $ Income : int 4218 4218 4227 4227 4237 4237 4248 4248 4259 4259 ...
## $ Life_expectancy: num 70.9 70.9 71 71 71 71 71 71 71.5 71.5 ...
## $ Region : Factor w/ 4 levels "Eastern Europe",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ Period : Factor w/ 3 levels "1980 - 1989",..: 1 1 1 1 1 1 1 1 1 1 ...
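The dimensions, variable names, and structure shown above are the kind of output that base R's `dim()`, `names()`, and `str()` produce. A minimal sketch on a small made-up data frame (`toy` and its values are hypothetical stand-ins, not the real data_eu):

```r
# Hypothetical stand-in for data_eu; the values are for illustration only.
toy <- data.frame(
  Country = factor(c("Albania", "Albania", "Austria", "Austria")),
  Year    = factor(c("1980", "1980", "1980", "1980")),
  Gender  = factor(c("male", "female", "male", "female")),
  Life_expectancy = c(70.9, 70.9, 72.7, 72.7)
)

toy_dim   <- dim(toy)              # number of rows and columns
toy_names <- names(toy)            # variable names
n_country <- nlevels(toy$Country)  # number of distinct countries

print(toy_dim)
print(toy_names)
str(toy)                           # type and first values of each variable
```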
The variable Country contains information for forty different European countries.
## [1] 40
These are:
## [1] "Albania" "Armenia"
## [3] "Austria" "Azerbaijan"
## [5] "Belarus" "Belgium"
## [7] "Bosnia and Herzegovina" "Bulgaria"
## [9] "Croatia" "Cyprus"
## [11] "Denmark" "Estonia"
## [13] "Finland" "France"
## [15] "Georgia" "Germany"
## [17] "Greece" "Hungary"
## [19] "Iceland" "Ireland"
## [21] "Italy" "Kazakhstan"
## [23] "Latvia" "Macedonia, FYR"
## [25] "Malta" "Moldova"
## [27] "Netherlands" "Norway"
## [29] "Poland" "Portugal"
## [31] "Romania" "Russia"
## [33] "Slovak Republic" "Slovenia"
## [35] "Spain" "Sweden"
## [37] "Switzerland" "Turkey"
## [39] "Ukraine" "United Kingdom"
The variable Year contains information for 25 years. The levels of variable Year are:
## [1] "1980" "1981" "1982" "1983" "1984" "1985" "1986" "1987" "1988" "1989"
## [11] "1990" "1991" "1992" "1993" "1994" "1995" "1996" "1997" "1998" "1999"
## [21] "2000" "2001" "2002" "2003" "2004"
The variable Gender has the following levels.
## [1] "female" "male"
The variable Region has 4 levels. These are:
## [1] "Eastern Europe" "Northern Europe" "Southern Europe" "Western Europe"
The variable Period has 3 levels. These are:
## [1] "1980 - 1989" "1990 - 1999" "2000 - 2004"
The summary of the data frame is shown below:
## Country Year Gender BMI_Index
## Albania : 50 1993 : 80 female:819 Min. :23.38
## Austria : 50 1994 : 80 male :819 1st Qu.:24.88
## Belgium : 50 1995 : 80 Median :25.33
## Bulgaria: 50 1996 : 80 Mean :25.37
## Cyprus : 50 1997 : 80 3rd Qu.:25.85
## Denmark : 50 1998 : 80 Max. :28.01
## (Other) :1338 (Other):1158
## Bloodpressure Cholesterol Sugar Food
## Min. :120.0 Min. :4.502 Min. : 5.48 Min. :1570
## 1st Qu.:129.7 1st Qu.:5.154 1st Qu.: 82.19 1st Qu.:2981
## Median :132.2 Median :5.365 Median :106.85 Median :3235
## Mean :132.0 Mean :5.385 Mean :102.22 Mean :3180
## 3rd Qu.:134.9 3rd Qu.:5.658 3rd Qu.:123.29 3rd Qu.:3441
## Max. :143.1 Max. :6.241 Max. :167.12 Max. :3817
##
## Income Life_expectancy Region Period
## Min. : 1466 Min. :62.70 Eastern Europe :638 1980 - 1989:500
## 1st Qu.:10858 1st Qu.:71.20 Northern Europe:250 1990 - 1999:738
## Median :21216 Median :75.00 Southern Europe:350 2000 - 2004:400
## Mean :21580 Mean :74.15 Western Europe :400
## 3rd Qu.:30785 3rd Qu.:77.30
## Max. :62370 Max. :81.10
##
The data frame data_eu covers 40 European countries over the 25 years from 1980 to 2004, with separate rows for females and males. Roughly 75% of the observations have a mean BMI of about 25 or more (first quartile 24.88), which is the threshold for overweight. On average, people in all countries have a systolic blood pressure at or above the recommended 120 mm Hg. The mean cholesterol value is 5.385 mmol/L (age-standardized mean), which indicates an elevated risk of heart disease. The average sugar consumption is 102.22 grams per person per day. The average daily energy supply is 3180 kilocalories per person. The maximum life expectancy is 81.1 years. The median income is $21216 per person per year.
The distribution of life expectancy is negatively skewed. The mean is 74.15 and the median is 75.
A life expectancy of 76 has the highest frequency.
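A mean below the median, as reported here, is the usual sign of a longer left tail. A minimal sketch of that check, using made-up values rather than the real Life_expectancy column (the vector `life` is hypothetical):

```r
# Made-up life expectancy values with a long left tail (not the real data).
life <- c(63, 68, 72, 74, 75, 76, 76, 77, 78, 79)

life_mean   <- mean(life)
life_median <- median(life)

# Moment-based sample skewness: negative values indicate a longer left tail.
skewness <- mean((life - life_mean)^3) / mean((life - life_mean)^2)^1.5

print(c(mean = life_mean, median = life_median, skewness = skewness))
```

Comparing mean and median is only a rule of thumb; the skewness coefficient gives a second, independent indication.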
Which country has the lowest life expectancy in Europe?
## Country Year Gender BMI_Index Bloodpressure Cholesterol Sugar Food
## 1513 Turkey 1980 female 26.06155 127.1133 4.867945 65.75 3277.66
## 1514 Turkey 1980 male 23.66064 125.8545 4.812084 65.75 3277.66
## Income Life_expectancy Region Period
## 1513 7828 62.7 Southern Europe 1980 - 1989
## 1514 7828 62.7 Southern Europe 1980 - 1989
It is Turkey in 1980.
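One way to find such rows in base R is to compare the column with its own minimum. The sketch below uses a four-row toy data frame built from values that appear in this report; `toy` itself is a hypothetical stand-in for data_eu:

```r
# Toy stand-in for data_eu using a few values quoted in this report.
toy <- data.frame(
  Country = c("Turkey", "Turkey", "Albania", "Albania"),
  Year    = c(1980, 1980, 1980, 1980),
  Gender  = c("female", "male", "male", "female"),
  Life_expectancy = c(62.7, 62.7, 70.9, 70.9),
  stringsAsFactors = FALSE
)

# Keep every row whose life expectancy equals the overall minimum.
lowest <- toy[toy$Life_expectancy == min(toy$Life_expectancy), ]
print(lowest)
```

Replacing `min()` with `max()` returns the rows with the highest value in the same way.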
Which country has the highest life expectancy in Europe?
## Country Year Gender BMI_Index Bloodpressure Cholesterol Sugar Food
## 781 Iceland 2004 male 26.73403 129.7408 5.737782 153.43 3310.98
## 782 Iceland 2004 female 25.67006 119.9665 5.631419 153.43 3310.98
## Income Life_expectancy Region Period
## 781 37482 81.1 Northern Europe 2000 - 2004
## 782 37482 81.1 Northern Europe 2000 - 2004
It is Iceland in 2004. The results of life expectancy summary are shown below.
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 62.70 71.20 75.00 74.15 77.30 81.10
Boxplot life expectancy by country between 1980-2004
The x-axis is ordered by median life expectancy of countries. Eastern European countries have the lowest median life expectancy.
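Ordering the x-axis by median can be done by reordering the factor levels before plotting. A minimal base R sketch with hypothetical countries X, Y, and Z (the source does not show how the original plot was built):

```r
# Hypothetical countries and values, for illustration only.
toy <- data.frame(
  Country = c("X", "X", "X", "Y", "Y", "Y", "Z", "Z", "Z"),
  Life_expectancy = c(76, 77, 78, 64, 65, 70, 70, 71, 72)
)

# Reorder the factor levels by the median life expectancy of each country.
toy$Country <- reorder(factor(toy$Country), toy$Life_expectancy, FUN = median)
level_order <- levels(toy$Country)
print(level_order)

# boxplot(Life_expectancy ~ Country, data = toy) would now draw the boxes
# from lowest to highest median.
```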
But which countries have, on average, a lower life expectancy than the European average of 74.14884 between 1980 and 2004?
## Source: local data frame [19 x 2]
##
## Country meanL
## (fctr) (dbl)
## 1 Kazakhstan 63.87692
## 2 Russia 65.63846
## 3 Azerbaijan 67.06154
## 4 Ukraine 67.46154
## 5 Turkey 67.99600
## 6 Belarus 68.35385
## 7 Moldova 68.43077
## 8 Latvia 68.88462
## 9 Estonia 69.54615
## 10 Romania 70.02000
## 11 Hungary 70.23600
## 12 Armenia 70.60000
## 13 Bulgaria 71.39200
## 14 Georgia 71.56154
## 15 Poland 72.06800
## 16 Bosnia and Herzegovina 72.91538
## 17 Slovak Republic 73.09167
## 18 Albania 73.14000
## 19 Macedonia, FYR 73.59231
It is interesting to see that all of these countries except Turkey are Eastern European countries.
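A table like the one above can be produced by computing a mean per country and keeping the countries below the overall mean. The sketch below uses base R and hypothetical countries A, B, and C; it is not the original code:

```r
# Hypothetical data: three countries with two observations each.
toy <- data.frame(
  Country = c("A", "A", "B", "B", "C", "C"),
  Life_expectancy = c(64, 66, 74, 76, 79, 81)
)

overall_mean <- mean(toy$Life_expectancy)

# Mean life expectancy per country.
by_country <- aggregate(Life_expectancy ~ Country, data = toy, FUN = mean)
names(by_country)[2] <- "meanL"

# Keep the countries below the overall mean, sorted ascending.
below <- by_country[by_country$meanL < overall_mean, ]
below <- below[order(below$meanL), ]
print(below)
```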
The histogram above looks like a normal distribution.
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 4.502 5.154 5.365 5.385 5.658 6.241
There is no big difference between the mean (5.385) and the median (5.365). Only about 25% of the observations have normal cholesterol values.
I wonder which country has the highest values.
## Country Year Gender BMI_Index Bloodpressure Cholesterol Sugar
## 1589 United Kingdom 1980 male 24.72216 136.5205 6.240528 117.81
## 1591 United Kingdom 1981 male 24.78911 136.3343 6.201839 120.55
## Food Income Life_expectancy Region Period
## 1589 3116.05 20417 73.4 Western Europe 1980 - 1989
## 1591 3099.19 20149 73.8 Western Europe 1980 - 1989
Males in the United Kingdom in 1980 and 1981 have the highest cholesterol values, with a maximum of 6.240528.
Which country has the lowest cholesterol value?
## Country Year Gender BMI_Index Bloodpressure Cholesterol Sugar
## 151 Azerbaijan 2004 male 24.89376 131.3845 4.501741 43.84
## Food Income Life_expectancy Region Period
## 151 2894.6 6435 69.3 Eastern Europe 2000 - 2004
It is Azerbaijan in 2004.
Boxplot Cholesterol vs. Country
Which countries have normal cholesterol values?
## Source: local data frame [14 x 2]
##
## Country meanL
## (fctr) (dbl)
## 1 Azerbaijan 4.707317
## 2 Bosnia and Herzegovina 4.710209
## 3 Georgia 4.759566
## 4 Armenia 4.803656
## 5 Moldova 4.820544
## 6 Turkey 4.838926
## 7 Albania 4.952450
## 8 Kazakhstan 4.958597
## 9 Macedonia, FYR 4.970246
## 10 Ukraine 5.032866
## 11 Croatia 5.130145
## 12 Romania 5.145889
## 13 Russia 5.171549
## 14 Belarus 5.190451
Except for Turkey, all countries with normal cholesterol values are Eastern European countries.
The histogram of variable Food is shown below.
The distribution is negatively skewed, with the highest frequency at about 3400 calories.
The summary for Food can be seen below:
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 1570 2981 3235 3180 3441 3817
The data set data_eu does not distinguish by gender, age, height, weight, or activity level when it comes to daily food supply; the average energy supply is the same for females and males. According to the summary, 50% of the observations show a daily intake of more than 3234.94 calories, and the mean is 3179.71 calories. This is high given that the average recommended intake for both sexes is 2400 calories (men: 2700 calories, women: 2100 calories).
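The comparison against the 2400-calorie reference can be expressed as the share of observations above that threshold. A small sketch with made-up intake values (the vector `food` is hypothetical):

```r
# Made-up daily energy supply values in calories (not the real Food column).
food <- c(2100, 2350, 2900, 3200, 3400, 3600)

# Reference value used in the text: the average of 2700 (men) and 2100 (women).
recommended <- (2700 + 2100) / 2

# The mean of a logical vector is the proportion of TRUE values.
share_above <- mean(food > recommended)
print(share_above)
```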
The next boxplot shows the energy intake by year.
The average intake of food energy (red dots) is above the recommended value in all years. But how does the intake of food energy look by country?
The countries are sorted by median food consumption per person per day over all years between 1980 and 2004. Food intake is lower in Eastern European countries than in the other countries.
Which European countries have a normal average calorie consumption?
## Source: local data frame [3 x 2]
##
## Country meanF
## (fctr) (dbl)
## 1 Armenia 2073.727
## 2 Georgia 2306.341
## 3 Azerbaijan 2363.422
The distribution of sugar consumption is negatively skewed. The highest frequency is around 120 grams per person per day, which is higher than the recommended value.
Below is the summary of Sugar.
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 5.48 82.19 106.80 102.20 123.30 167.10
The median is 106.8 and the mean is 102.2. A healthy sugar consumption should be 5% of the total energy intake per person, that is, 120 calories per person per day, which is equivalent to about 31 grams per day.
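The arithmetic behind the 31-gram figure can be written out as follows. The energy density of about 3.87 kilocalories per gram of sugar is an assumption added here to reproduce the number in the text; the source does not state which conversion factor was used.

```r
# Recommended daily energy intake used in this report (calories).
kcal_women <- 2100
kcal_men   <- 2700
kcal_avg   <- (kcal_women + kcal_men) / 2   # average for both sexes

# Healthy sugar share of total energy intake: 5%.
sugar_kcal <- 0.05 * kcal_avg

# Assumption: roughly 3.87 kcal per gram of sugar.
kcal_per_gram <- 3.87
sugar_grams   <- sugar_kcal / kcal_per_gram

print(c(kcal_avg = kcal_avg, sugar_kcal = sugar_kcal, sugar_grams = sugar_grams))
```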
The x-axis is ordered by median sugar consumption per person and day for all years between 1980-2004. Estonia is the country with the highest degree of dispersion.
In which countries is the sugar consumption 5.48 grams (the minimum) or 167.12 grams (the maximum) per day?
## Country Year Sugar Life_expectancy
## 1 Armenia 1993 5.48 69.2
## 2 Estonia 2003 167.12 71.4
## 3 Estonia 2004 167.12 71.9
Armenia is the country with the lowest sugar consumption, both for single years and on average. The country with the highest sugar consumption for single years is Estonia. Iceland has the highest median value.
The BMI distribution is positively skewed. The highest peak lies above a BMI of 25, which means overweight.
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 23.38 24.88 25.33 25.37 25.85 28.01
50% of the observations show slight overweight. Only about 25% of the observations have a BMI in the normal range.
Between 1980 and 2004 the BMI increased.
The histogram of systolic blood pressure shows that the distribution is negatively skewed. 50% of the observations have a systolic blood pressure above 132 mm Hg, which falls in the prehypertension range (120 to 139 mm Hg).
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 120.0 129.7 132.2 132.0 134.9 143.1
The summary confirms this. The mean and median are 132 mm Hg and 132.2 mm Hg, respectively.
Blood pressure values around 130 mm Hg are very likely.
How did systolic blood pressure develop from 1980 until 2004?
From 1980 until 1991 the blood pressure decreases. In 1992 a small increase is noticeable for the first time. Thereafter it decreases again until 2004. Is the use of medication the reason for the decrease in systolic blood pressure?
How does it look for each country during the period 1980-2004?
The majority of countries have high systolic blood pressure. The red dot denotes the average value.
But which countries have a normal systolic blood pressure?
## Country Year
## 1 Iceland 2004
In 2004, Iceland is the only country with normal systolic blood pressure.
Which countries have an average systolic blood pressure greater than 140 mm Hg?
## Country
## 1 Finland
## 2 Germany
## 3 Norway
The next plot shows the histogram of variable Income.
The distribution of income per person is skewed. Obviously there is income inequality, which was expected given the differences in development between countries. Political, geographical, and economic factors also have an impact on income.
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 1466 10860 21220 21580 30780 62370
Next I look closer at the outliers. I am interested in which country has an income of $1466 per person per year.
## Source: local data frame [2 x 12]
## Groups: Country [1]
##
## Country Year Gender BMI_Index Bloodpressure Cholesterol
## (fctr) (fctr) (fctr) (dbl) (dbl) (dbl)
## 1 Bosnia and Herzegovina 1993 female 25.20537 132.8245 4.808685
## 2 Bosnia and Herzegovina 1993 male 25.17015 131.9561 4.711708
## Variables not shown: Sugar (dbl), Food (dbl), Income (int),
## Life_expectancy (dbl), Region (fctr), Period (fctr)
It was Bosnia and Herzegovina in 1993. This can be explained by the Bosnian war between 1992-1995.
Which country has the highest income per person per year in Europe?
## Source: local data frame [2 x 12]
## Groups: Country [1]
##
## Country Year Gender BMI_Index Bloodpressure Cholesterol Sugar Food
## (fctr) (fctr) (fctr) (dbl) (dbl) (dbl) (dbl) (dbl)
## 1 Norway 2004 female 25.47340 127.2539 5.373211 120.55 3458.47
## 2 Norway 2004 male 26.50614 134.8698 5.456786 120.55 3458.47
## Variables not shown: Income (int), Life_expectancy (dbl), Region (fctr),
## Period (fctr)
Norway is the country with the highest income in Europe. This was expected because Norway is an exporter of oil and gas.
How do the various countries score regarding income?
The countries are sorted by median Income. Countries in Eastern Europe have the lowest income among the European countries, followed by countries in Southern Europe. In Western and Northern Europe income is high, and it is highest in Norway. The red dot in each boxplot denotes the average Income.
There are 1638 observations in the data set data_eu with 12 features. These are: Country, Year, Gender, BMI_Index, Bloodpressure, Cholesterol, Sugar, Food, Income, Life_expectancy, Region and Period. The variables Country, Year, Gender, Region and Period are ordered factor variables with the following levels.
Country is ordered alphabetically and has 40 levels: from Albania to United Kingdom
Year is ordered and has 25 levels: 1980 to 2004
Gender has two levels: female and male
Region has four levels: Eastern Europe, Northern Europe, Southern Europe, and Western Europe
Period has three levels: 1980 - 1989, 1990 - 1999 and 2000 - 2004
The mean cholesterol value is 5.385, which is already on the borderline to high risk.
The main features in the data set data_eu are Life_expectancy, Cholesterol, Bloodpressure and Income. I think that systolic blood pressure and cholesterol play an important role in human health and therefore in life expectancy. On the other hand, I think that income has a positive impact on life expectancy, since it enables better health care and hence a longer life.
Other features that can help the investigation are Food, Sugar, BMI_Index, Region and Country. The daily amount of calories and the amount of sugar a person consumes are important factors for weight. One indicator of a person's health is the BMI. Region divides Europe not only geographically but also economically, which means that the region in which people live raises or lowers their chance of a longer life. Country is also a factor that affects people's health: if a country has a sophisticated health system and its population has equal access to public health care services, then life expectancy is also higher.
Yes, I did. I added the variable Region to the data set to group the European countries by region, since region is not only a geographical distinction but also indicates a different level of affluence. I also added the variable Period in order to investigate the change in life expectancy over time.
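The source does not show how Region and Period were derived. One plausible base R approach is a named lookup vector for Region and `cut()` for Period, sketched below on a four-row toy data frame (the object names `toy` and `region_of` are hypothetical; the region assignments follow the rows printed in this report):

```r
# Toy stand-in with a numeric Year column.
toy <- data.frame(
  Country = c("Albania", "Turkey", "Iceland", "United Kingdom"),
  Year    = c(1985, 1994, 2004, 1990),
  stringsAsFactors = FALSE
)

# Region via a named lookup vector.
region_of <- c("Albania"        = "Eastern Europe",
               "Turkey"         = "Southern Europe",
               "Iceland"        = "Northern Europe",
               "United Kingdom" = "Western Europe")
toy$Region <- factor(unname(region_of[toy$Country]))

# Period via cut(): intervals (1979,1989], (1989,1999], (1999,2004].
toy$Period <- cut(toy$Year,
                  breaks = c(1979, 1989, 1999, 2004),
                  labels = c("1980 - 1989", "1990 - 1999", "2000 - 2004"))
print(toy)
```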
The Cholesterol data set is a merge of the separate cholesterol data sets for males and females. The Systolic Blood Pressure data set is a merge of the separate systolic blood pressure data sets for males and females. The BMI data set is a merge of the separate BMI data sets for males and females.
The remaining data sets do not distinguish between male and female; they have one value for both genders.
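How the merge was carried out is not shown. A minimal sketch of one way to do it in base R: tag each gender-specific table, stack them with `rbind()`, and join a gender-independent table with `merge()`. The data frame names are hypothetical; the Albania values are taken from the structure output above, and the Austria values are made up.

```r
# Gender-specific tables (Austria values are made up for illustration).
chol_male <- data.frame(Country = c("Albania", "Austria"),
                        Year = c(1980, 1980),
                        Cholesterol = c(5.01, 5.90),
                        stringsAsFactors = FALSE)
chol_female <- data.frame(Country = c("Albania", "Austria"),
                          Year = c(1980, 1980),
                          Cholesterol = c(5.04, 5.80),
                          stringsAsFactors = FALSE)

# Tag each table with its gender and stack them.
chol_male$Gender   <- "male"
chol_female$Gender <- "female"
chol <- rbind(chol_male, chol_female)

# A table with one value for both genders.
income <- data.frame(Country = c("Albania", "Austria"),
                     Year = c(1980, 1980),
                     Income = c(4218, 20000),
                     stringsAsFactors = FALSE)

# Join on Country and Year; the income value is repeated for both genders.
merged <- merge(chol, income, by = c("Country", "Year"))
print(merged)
```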
The first scatterplot shows the relationship between life expectancy vs. cholesterol in Europe between 1980-2004.
The higher the life expectancy, the higher the cholesterol values. This holds up to a cholesterol value of around 5.4 mmol/L (age-standardized mean); beyond that, life expectancy decreases as cholesterol gets higher. The scatterplot also shows that people live shorter lives even when they do not have high cholesterol values. This is contrary to the common recommendation that cholesterol should be below 5.2 mmol/L (age-standardized mean).
The 10%, 50% and 90% quantiles of data_eu are represented by dark green, red and purple dashed lines, respectively. The mean is shown by a black dashed line. For observations with cholesterol below 5.2, which corresponds to normal cholesterol, 10%, 50% and 90% have a life expectancy under 68.2, 71.8 and 79.4 years, respectively. For the group with cholesterol values equal to or greater than 5.2, which means borderline high or high risk for many diseases, the corresponding values are 75.4, 77.6 and 80.1 years.
The results show that, in general, the group with high cholesterol lives longer than the group with normal cholesterol values.
The correlation of Life_expectancy and Cholesterol is 0.52.
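The group-wise quantiles can be computed by splitting life expectancy at the 5.2 mmol/L threshold and applying `quantile()` to each group. A sketch with made-up values (the data frame `toy` is hypothetical and does not reproduce the numbers above):

```r
# Made-up cholesterol and life expectancy values, for illustration only.
toy <- data.frame(
  Cholesterol     = c(4.7, 4.9, 5.0, 5.1, 5.3, 5.5, 5.7, 5.9),
  Life_expectancy = c(66, 70, 72, 78, 74, 77, 78, 80)
)

# Group by the 5.2 mmol/L threshold used in the text.
toy$Group <- ifelse(toy$Cholesterol < 5.2, "normal", "elevated")

# 10%, 50% and 90% quantiles of life expectancy per group (rows = quantiles).
q <- sapply(split(toy$Life_expectancy, toy$Group),
            quantile, probs = c(0.1, 0.5, 0.9))
print(q)
```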
## [1] 0.5206328
The variables Life_expectancy and Cholesterol are moderately related.
Next I look closer at life expectancy by region and country.
People in Eastern Europe have the lowest median and mean life expectancy in Europe.
People in Kazakhstan have, on average, the lowest life expectancy in Europe.
Below are the mean values of life expectancy in Eastern, Southern, Western and Northern Europe.
## [1] 70.55423
## [1] 75.77543
## [1] 76.5725
## [1] 77.1672
The median values of life expectancy in Eastern, Southern, Western and Northern Europe are:
## [1] 70.8
## [1] 76.8
## [1] 76.65
## [1] 77.2
Mean life expectancy is lowest in Eastern Europe, followed by Southern and Western Europe, and highest in Northern Europe. Median life expectancy is also lowest in Eastern Europe, but is followed by Western, Southern, and Northern Europe.
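Mean and median can rank regions differently, as they do here for Southern and Western Europe. A small sketch with made-up values shows how per-region means and medians can be computed with `tapply()` and how the two rankings can disagree (the data frame `toy` is hypothetical):

```r
# Made-up values: Southern Europe has one low value that pulls its mean down.
toy <- data.frame(
  Region = rep(c("Southern Europe", "Western Europe"), each = 3),
  Life_expectancy = c(72, 77, 78, 76, 76.5, 77.5)
)

region_mean   <- tapply(toy$Life_expectancy, toy$Region, mean)
region_median <- tapply(toy$Life_expectancy, toy$Region, median)

print(rbind(mean = region_mean, median = region_median))
```

In this toy example Southern Europe has the lower mean but the higher median, the same pattern as in the real output.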
In Eastern Europe the mean and median cholesterol values are below the recommended value of 5.2 mmol/L (age-standardized mean). The red dots denote the mean cholesterol value.
The boxplot below visualizes this.
Below are the mean cholesterol values in Eastern, Southern, Western and Northern Europe.
## [1] 5.082286
## [1] 5.323818
## [1] 5.700119
## [1] 5.736535
The median cholesterol values in Eastern, Southern, Western and Northern Europe:
## [1] 5.129331
## [1] 5.357579
## [1] 5.705645
## [1] 5.751593
On average, the cholesterol value is lowest in Eastern Europe, followed by Southern and Western Europe, and highest in Northern Europe.
Eastern Europe has, on average, both the lowest life expectancy and the lowest cholesterol value, followed by Southern, Western and Northern Europe. Northern Europe has, on average, both the highest life expectancy and the highest cholesterol value. This is contrary to what I expected.
The first scatterplot shows the relationship between life expectancy vs. systolic blood pressure.
Few countries have the desired systolic blood pressure value.
## Country Year Gender BMI_Index Bloodpressure Cholesterol Sugar Food
## 782 Iceland 2004 female 25.67006 119.9665 5.631419 153.43 3310.98
## Income Life_expectancy Region Period
## 782 37482 81.1 Northern Europe 2000 - 2004
Only women in Iceland in 2004 had the desired systolic blood pressure. Most European countries have systolic blood pressure in the prehypertension range. With a systolic blood pressure greater than 134 mm Hg, life expectancy decreases.
There are few data points with systolic blood pressure equal to or less than 126 mm Hg. I adjust the x-axis to focus on this area.
At 120 mm Hg the maximum life expectancy is reached. Thereafter life expectancy decreases as systolic blood pressure increases.
As a next step I look closer at data_eu for systolic values between 126 mm Hg and 139 mm Hg, because most of the data is concentrated in this interval.
Only 10% of Europeans have a life expectancy below 70 years, 50% have a life expectancy below 77.6 years, and 95% below 80 years. The 95% and 50% quantiles and the mean decrease as systolic blood pressure increases. In contrast, the 10% quantile increases as systolic blood pressure increases.
I create a subset for the 0.1 quantile, that is, systolic blood pressure between 127 mm Hg and 139 mm Hg and life expectancy below 70 years. Thereafter I create a boxplot of Country vs. Life_expectancy.
## Country
## 1 Armenia
## 2 Azerbaijan
## 3 Belarus
## 4 Bosnia and Herzegovina
## 5 Estonia
## 6 Georgia
## 7 Hungary
## 8 Kazakhstan
## 9 Latvia
## 10 Moldova
## 11 Romania
## 12 Russia
## 13 Turkey
## 14 Ukraine
All countries except Turkey are Eastern European countries. This can also be shown with a boxplot.
A characteristic of this subset is that its data starts in 1992, except for Turkey and Hungary, where it starts in 1980.
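A subset like this one can be built by combining the blood pressure and life expectancy conditions and then listing the distinct countries. The sketch below uses base R `subset()` on made-up rows (the data frame `toy` and its values are hypothetical):

```r
# Made-up rows, for illustration only.
toy <- data.frame(
  Country         = c("Russia", "Russia", "Turkey", "Norway", "Ukraine"),
  Bloodpressure   = c(133, 134, 128, 141, 136),
  Life_expectancy = c(65, 66, 64, 76, 71),
  stringsAsFactors = FALSE
)

# Systolic blood pressure between 127 and 139 mm Hg and life expectancy below 70.
low_life <- subset(toy, Bloodpressure >= 127 & Bloodpressure <= 139 &
                        Life_expectancy < 70)

# Distinct countries in the subset, sorted alphabetically.
countries <- sort(unique(low_life$Country))
print(countries)
```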
But what about the life expectancy with systolic blood pressure greater than 139 mm Hg?
A few countries have a systolic blood pressure greater than 139 mm Hg. These are Finland, Germany, Hungary, Ireland and Norway.
## Country
## 1 Finland
## 2 Germany
## 3 Hungary
## 4 Ireland
## 5 Norway
The scatterplot shows the relationship between life expectancy and income. It could be interpreted as high income prolonging life, especially up to a life expectancy of 76. Thereafter life expectancy grows much more slowly than income.
The correlation between Life_expectancy and Income is 0.76, which is strong.
The mean and median of income in each European region are shown below. The mean income values in Eastern, Southern, Western, and Northern Europe are:
## [1] 9826.147
## [1] 21548.51
## [1] 32402.58
## [1] 34303.06
The median income values in Eastern, Southern, Western, and Northern Europe are:
## [1] 9794
## [1] 21442
## [1] 31854
## [1] 32297
Eastern Europe has the lowest mean and median income, followed by Southern, Western and Northern Europe.
## Source: local data frame [20 x 4]
##
## Country mean_income median_income n
## (fctr) (dbl) (dbl) (int)
## 1 Moldova 2757.615 2596 26
## 2 Armenia 2824.846 2636 26
## 3 Georgia 3152.231 3185 26
## 4 Albania 4440.640 4281 50
## 5 Azerbaijan 4610.077 4459 26
## 6 Bosnia and Herzegovina 4676.692 5609 26
## 7 Ukraine 5684.846 5305 26
## 8 Belarus 7061.385 6879 26
## 9 Macedonia, FYR 8154.385 8229 26
## 10 Bulgaria 9666.160 10088 50
## 11 Kazakhstan 10126.077 9706 26
## 12 Latvia 10360.846 9912 26
## 13 Romania 11751.800 11449 50
## 14 Poland 11944.320 11212 50
## 15 Russia 13495.538 13173 26
## 16 Estonia 13682.923 13705 26
## 17 Croatia 14689.385 14652 26
## 18 Slovak Republic 14860.000 15062 24
## 19 Hungary 16988.200 16989 50
## 20 Slovenia 20757.231 20585 26
## Source: local data frame [5 x 4]
##
## Country mean_income median_income n
## (fctr) (dbl) (dbl) (int)
## 1 Finland 28341.84 27282 50
## 2 Iceland 29179.64 28629 50
## 3 Sweden 31473.16 30596 50
## 4 Denmark 34966.28 34008 50
## 5 Norway 47554.40 45742 50
Moldova has the lowest mean and median income in Eastern Europe, and Norway has the highest in Northern Europe.
Before I investigate other features I want to look closer at the correlations between them.
## BMI_Index Bloodpressure Cholesterol Sugar
## BMI_Index 1.00000000 0.13814532 -0.1843664 -0.02879355
## Bloodpressure 0.13814532 1.00000000 0.1630146 -0.07668586
## Cholesterol -0.18436641 0.16301460 1.0000000 0.59618776
## Sugar -0.02879355 -0.07668586 0.5961878 1.00000000
## Food 0.07789474 -0.15094300 0.4399585 0.49576125
## Income -0.07565821 -0.18894575 0.6348446 0.56375438
## Life_expectancy 0.01562241 -0.21458422 0.5206328 0.44626758
## Food Income Life_expectancy
## BMI_Index 0.07789474 -0.07565821 0.01562241
## Bloodpressure -0.15094300 -0.18894575 -0.21458422
## Cholesterol 0.43995853 0.63484464 0.52063276
## Sugar 0.49576125 0.56375438 0.44626758
## Food 1.00000000 0.56796435 0.41764922
## Income 0.56796435 1.00000000 0.76416961
## Life_expectancy 0.41764922 0.76416961 1.00000000
The correlation matrix shows the relationships between the variables in data_eu. Life expectancy correlates strongly with income and moderately with cholesterol, sugar and food. The relationship between life expectancy and income is the strongest among the variables. Unexpectedly, systolic blood pressure has only a weak negative relationship with life expectancy (correlation = -0.21). This seems unusual, since high systolic blood pressure causes serious heart disease and strokes, which often lead to death and shorten life expectancy. Cholesterol correlates moderately with income and sugar, and sugar correlates moderately with income and food.
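A correlation matrix of this kind comes from calling `cor()` on the numeric columns; it computes Pearson correlations by default. The sketch below uses a made-up six-row data frame (`toy` is hypothetical and does not reproduce the coefficients above) and then ranks the variables by their correlation with life expectancy:

```r
# Made-up numeric data, for illustration only.
toy <- data.frame(
  Income          = c(3000, 8000, 15000, 22000, 30000, 45000),
  Cholesterol     = c(4.8, 5.3, 5.0, 5.6, 5.7, 5.4),
  Bloodpressure   = c(134, 133, 135, 131, 132, 129),
  Life_expectancy = c(66, 69, 72, 76, 78, 80)
)

cm <- cor(toy)  # Pearson correlation matrix

# Correlations of every other variable with life expectancy, strongest first.
others   <- colnames(cm) != "Life_expectancy"
life_cor <- sort(cm["Life_expectancy", others], decreasing = TRUE)

print(round(cm, 2))
print(round(life_cor, 2))
```

Correlation measures linear association only, so a weak coefficient does not rule out a nonlinear relationship such as the one visible in the scatterplots.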
The scatterplot matrix below gives an overview of the plots of all variables and the correlations between them.
Next I look closer at the scatterplots of life expectancy with sugar and food, then at the scatterplot of income with cholesterol, and finally at the scatterplot of sugar and cholesterol.
According to the World Health Organization, the daily intake of sugar should be less than 5% of the daily food intake. The recommended daily food consumption is on average 2400 calories (women: 2100 calories, men: 2700 calories). That means the daily sugar consumption should be at most 120 calories, which is equivalent to about 31 grams per day.
Sugar consumption in Europe is above the recommended value. It is interesting to know which countries have a healthy sugar consumption, and which have the minimum and maximum values.
## Region Country Year
## 1 Eastern Europe Armenia 1993
## Region Country Year
## 1 Eastern Europe Armenia 1992
## 2 Eastern Europe Armenia 1993
## 3 Eastern Europe Armenia 1994
## 4 Eastern Europe Azerbaijan 1993
## 5 Eastern Europe Azerbaijan 1994
## 6 Eastern Europe Azerbaijan 2000
## 7 Eastern Europe Bosnia and Herzegovina 1994
## 8 Eastern Europe Bosnia and Herzegovina 1995
## 9 Eastern Europe Bosnia and Herzegovina 1996
## 10 Eastern Europe Georgia 1993
## 11 Eastern Europe Georgia 1994
## Region Country Year
## 1 Eastern Europe Estonia 2003
## 2 Eastern Europe Estonia 2004
Armenia in 1993 has the lowest sugar intake (first table above).
The second table above lists the countries and years with a healthy sugar intake.
Estonia in 2003 and 2004 has the highest sugar intake (third table above).
All of the above countries are Eastern European countries.
It looks as though the higher the daily food intake, the higher the life expectancy.
The median life expectancy increases until 1991. Then it decreases and after 1992 increases again.
Below are the median values of life expectancy in Eastern, Southern, Western and Northern Europe,
## [1] 70.8
## [1] 76.8
## [1] 76.65
## [1] 77.2
and the median value of life expectancy in periods: 1980 - 1989, 1990 - 1999, 2000 - 2004.
## [1] 74.7
## [1] 75.2
## [1] 75.95
The results show that median life expectancy by region is lowest in Eastern Europe, followed by Western, Southern and Northern Europe. Interestingly, median life expectancy in Southern Europe is higher than in Western Europe.
By period, the median life expectancy is lowest between 1980 and 1989 and increases in each following period.
The cholesterol value increases as income increases. This holds until income reaches about 27500 $PPP; then cholesterol decreases as income increases. A possible reason for the fall in cholesterol is that people with higher income take medication to lower it or try to reduce their cholesterol intake.
The cholesterol values increase as sugar consumption increases. The turning point is at about 120 grams of sugar; after this, cholesterol decreases as sugar consumption increases. A possible explanation is the use of cholesterol-lowering medication.
## [1] Country Year Gender BMI_Index
## [5] Bloodpressure Cholesterol Sugar Food
## [9] Income Life_expectancy Region Period
## <0 rows> (or 0-length row.names)
No country has normal cholesterol and healthy sugar consumption between 1980 and 1989.
## Country Year Gender BMI_Index Bloodpressure Cholesterol
## 1 Armenia 1992 female 26.09322 134.0298 5.067329
## 2 Armenia 1992 male 24.12982 134.9878 4.908696
## 3 Armenia 1993 female 25.99631 133.6498 5.016787
## 4 Armenia 1993 male 24.05854 134.5899 4.858117
## 5 Armenia 1994 male 24.02297 134.2892 4.811986
## 6 Armenia 1994 female 25.93440 133.3841 4.972368
## 7 Azerbaijan 1993 male 24.76250 133.1220 4.829249
## 8 Azerbaijan 1993 female 26.57865 130.3269 4.983414
## 9 Azerbaijan 1994 female 26.48774 129.9787 4.929545
## 10 Azerbaijan 1994 male 24.69113 132.6836 4.772017
## 11 Bosnia and Herzegovina 1994 male 25.09667 131.4639 4.663593
## 12 Bosnia and Herzegovina 1994 female 25.17425 132.7103 4.762538
## 13 Bosnia and Herzegovina 1995 male 25.07033 131.1335 4.630342
## 14 Bosnia and Herzegovina 1995 female 25.19614 132.6593 4.730610
## 15 Bosnia and Herzegovina 1996 female 25.28831 132.7398 4.718254
## 16 Bosnia and Herzegovina 1996 male 25.12328 131.0250 4.617823
## 17 Georgia 1993 male 24.82921 137.3146 4.881995
## 18 Georgia 1993 female 25.70626 132.9283 5.029236
## 19 Georgia 1994 male 24.73249 136.8331 4.814095
## 20 Georgia 1994 female 25.56143 132.5144 4.966557
## Sugar Food Income Life_expectancy Region Period
## 1 13.70 1833.98 1973 69.4 Eastern Europe 1990 - 1999
## 2 13.70 1833.98 1973 69.4 Eastern Europe 1990 - 1999
## 3 5.48 1868.46 1842 69.2 Eastern Europe 1990 - 1999
## 4 5.48 1868.46 1842 69.2 Eastern Europe 1990 - 1999
## 5 21.92 1945.81 1988 69.2 Eastern Europe 1990 - 1999
## 6 21.92 1945.81 1988 69.2 Eastern Europe 1990 - 1999
## 7 30.14 2217.92 4806 65.1 Eastern Europe 1990 - 1999
## 8 30.14 2217.92 4806 65.1 Eastern Europe 1990 - 1999
## 9 30.14 2100.15 3808 65.1 Eastern Europe 1990 - 1999
## 10 30.14 2100.15 3808 65.1 Eastern Europe 1990 - 1999
## 11 13.70 2724.51 1574 70.6 Eastern Europe 1990 - 1999
## 12 13.70 2724.51 1574 70.6 Eastern Europe 1990 - 1999
## 13 13.70 2737.26 1976 67.1 Eastern Europe 1990 - 1999
## 14 13.70 2737.26 1976 67.1 Eastern Europe 1990 - 1999
## 15 27.40 2851.18 3771 73.0 Eastern Europe 1990 - 1999
## 16 27.40 2851.18 3771 73.0 Eastern Europe 1990 - 1999
## 17 30.14 1569.77 2410 69.7 Eastern Europe 1990 - 1999
## 18 30.14 1569.77 2410 69.7 Eastern Europe 1990 - 1999
## 19 16.44 1776.61 2181 70.7 Eastern Europe 1990 - 1999
## 20 16.44 1776.61 2181 70.7 Eastern Europe 1990 - 1999
Only a few countries in Eastern Europe have normal cholesterol values and healthy sugar consumption between 1990 and 1999; they are listed above.
## Country Year Gender BMI_Index Bloodpressure Cholesterol Sugar Food
## 1 Azerbaijan 2000 male 24.51287 131.2280 4.534855 27.4 2406.22
## 2 Azerbaijan 2000 female 26.34661 128.5312 4.692858 27.4 2406.22
## Income Life_expectancy Region Period
## 1 4459 68 Eastern Europe 2000 - 2004
## 2 4459 68 Eastern Europe 2000 - 2004
Between 2000 and 2004, only Azerbaijan in 2000 has normal cholesterol values and healthy sugar consumption.
During 1980-2004 all European regions have in average BMI index above the normal. Lowest have the countries in Northern Europe, followed by Western, Eastern and Southern Europe.
Since 1980 the mean and median of BMI index is increasing. Only between 1980-1989 the BMI in average normal. Red dot in boxplot denotes the mean value.
It is clear to see that the food and the sugar consumption in Europe is mostly above the recommended values.
Life_expectancy which is one of my main features correlates strongly with Income. The correlation between them is 0.7641696. Life_expectancy correlates with Cholesterol moderately and has almost no relationship with Bloodpressure, (correlation = -0.2145842). Income correlates moderately with Choresterol. The correlation between them is quite high, 0.6348446. The relationship between Choresterol and Sugar is moderate.
The correlations between Life_expectancy and Sugar and between Life_expectancy and Food are also moderate.
In the correlation matrix, BMI has a very weak correlation with all other features. I expected a strong relationship, since BMI categorizes people as underweight, normal weight or overweight and therefore indicates their health condition.
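A correlation matrix like the one described here can be computed with base R's `cor()`. This is a sketch under stated assumptions: the six rows in `toy` are invented for illustration, and the original analysis may have used a different function or plotting package.

```r
# Hypothetical toy data (invented values); column names follow data_eu.
toy <- data.frame(
  Life_expectancy = c(66.0, 68.5, 71.0, 75.5, 77.0, 78.5),
  Income          = c(4000, 7000, 12000, 24000, 31000, 38000),
  Cholesterol     = c(4.9, 5.0, 5.2, 5.4, 5.7, 5.8),
  BMI_Index       = c(25.6, 25.1, 25.9, 25.3, 25.0, 25.4)
)

# Keep only numeric columns; factors such as Country would make cor() fail.
numeric_cols <- toy[sapply(toy, is.numeric)]

# Pearson correlation is the default method of cor().
cor_matrix <- cor(numeric_cols)
print(round(cor_matrix, 2))

# Correlations of every other feature with Life_expectancy, strongest first.
others   <- colnames(cor_matrix) != "Life_expectancy"
life_cor <- sort(cor_matrix["Life_expectancy", others], decreasing = TRUE)
print(round(life_cor, 2))
```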
Median income and median cholesterol increase in all periods and are lowest in Eastern Europe, followed by Southern, Western and Northern Europe. Median life expectancy increased in all three periods and is lowest in Eastern Europe, followed by Western, Southern and Northern Europe.
Between 1980 and 2004, the average BMI in all European regions was above the normal range. Northern Europe had the lowest BMI, followed by Western, Eastern and Southern Europe. The BMI has been increasing since 1980; only in 1980-1989 was the average BMI normal. Daily food and sugar consumption per person in Europe is above the recommended values.
The strongest relationship I found is between Life_expectancy and Income. Both are main features.
Life expectancy in Europe increased. People with high income live longer than people with low income. At low income levels, small changes in income correspond to large changes in life expectancy, whereas changes at high income levels have little impact on life expectancy.
Life expectancy in Southern, Western and Northern Europe increased during all periods. In Eastern Europe mean life expectancy decreased between 1990 and 1999, and the difference between minimum and maximum life expectancy increased over the periods. It is interesting that life expectancy in Southern Europe increased and reached a high level even though income there is not as high as in Western and Northern Europe.
There appears to be almost no correlation between life expectancy and income in Northern Europe between 1990 and 1999. This might be explained by a very good health system that is independent of people’s income.
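A region- and period-specific correlation such as the one for Northern Europe can be checked by filtering the rows before calling `cor()`. The sketch below uses invented rows, and the helper name `cor_in_group` is hypothetical; it only illustrates the mechanics.

```r
# Hypothetical toy data (invented values); column names follow data_eu.
toy <- data.frame(
  Region = c(rep("Northern Europe", 5), rep("Eastern Europe", 4)),
  Period = c("1980 - 1989", rep("1990 - 1999", 8)),
  Income = c(25000, 30000, 32000, 34000, 36000,
             4000, 6000, 8000, 10000),
  Life_expectancy = c(75.0, 77.0, 76.8, 77.1, 76.9,
                      66.0, 68.0, 70.0, 71.0)
)

# Hypothetical helper: correlation of Income and Life_expectancy
# within one region and one period.
cor_in_group <- function(data, region, period) {
  rows <- data[data$Region == region & data$Period == period, ]
  cor(rows$Income, rows$Life_expectancy)
}

north <- cor_in_group(toy, "Northern Europe", "1990 - 1999")
east  <- cor_in_group(toy, "Eastern Europe", "1990 - 1999")

print(round(c(Northern = north, Eastern = east), 3))
```

In this toy data the Northern rows were chosen so that the correlation is close to zero while the Eastern rows are strongly correlated, which mirrors the pattern described in the text without reproducing the real values.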
More information about the development of income and life expectancy in Europe over the years is shown in the graph below.
In general, life expectancy in most European countries increased during the analyzed period. In some Eastern European countries life expectancy decreased slightly. In Russia, for example, life expectancy decreased even though income had been increasing since 1998.
In Ukraine, Moldova, Romania and Macedonia (FYR), income decreased but life expectancy stayed at almost the same level or decreased slightly between 1980 and 2004. The decrease in life expectancy in Bosnia and Herzegovina can be explained by the war at the beginning of the ’90s.
Eastern European countries have the lowest income, followed by Southern European countries.
Mean life expectancy in Western, Northern and Southern Europe increased. Cholesterol values for these regions decrease, as the ellipses move to the left.
In Eastern Europe mean life expectancy decreased between 1990 and 1999 but increased in the following years. Cholesterol values decreased for that region. In the 1980s the ellipse is small, and it grows over the following periods, which means that the countries in that region developed differently.
In most European countries life expectancy increases when cholesterol decreases. This holds for both genders. Only in a few Eastern European countries, such as Belarus, Russia and Ukraine, does life expectancy decrease when cholesterol values decrease.
## Country Year Gender BMI_Index Bloodpressure
## Russia :26 1992 : 2 female:13 Min. :24.95 Min. :128.4
## Albania : 0 1993 : 2 male :13 1st Qu.:24.99 1st Qu.:129.0
## Armenia : 0 1994 : 2 Median :25.92 Median :131.0
## Austria : 0 1995 : 2 Mean :25.79 Mean :130.4
## Azerbaijan: 0 1996 : 2 3rd Qu.:26.48 3rd Qu.:131.4
## Belarus : 0 1997 : 2 Max. :26.81 Max. :133.0
## (Other) : 0 (Other):14
## Cholesterol Sugar Food Income
## Min. :4.929 Min. : 90.41 Min. :2827 Min. :11173
## 1st Qu.:5.046 1st Qu.: 98.63 1st Qu.:2884 1st Qu.:11925
## Median :5.163 Median :106.85 Median :2926 Median :13173
## Mean :5.172 Mean :107.06 Mean :2958 Mean :13496
## 3rd Qu.:5.269 3rd Qu.:117.81 3rd Qu.:3032 3rd Qu.:14629
## Max. :5.496 Max. :120.55 Max. :3143 Max. :16967
##
## Life_expectancy Region Period
## Min. :63.60 Eastern Europe :26 1980 - 1989: 0
## 1st Qu.:64.90 Northern Europe: 0 1990 - 1999:16
## Median :65.20 Southern Europe: 0 2000 - 2004:10
## Mean :65.64 Western Europe : 0
## 3rd Qu.:66.20
## Max. :68.00
##
Life expectancy in Russia decreased until 1994, increased between 1994 and 1998, and then decreased again. During this period the cholesterol values were normal. Women have higher cholesterol values than men, but their life expectancy is at the same level.
It seems to me that the decline in life expectancy has less to do with cholesterol than with the political, economic and social situation in the country during that period.
Although women in most countries have higher cholesterol values than men, their life expectancy is almost the same.
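The gender comparison can be sketched with a grouped mean, together with a check of how much Life_expectancy differs between the male and female rows of the same country and year. The `str()` output at the start of the report shows identical Life_expectancy values for consecutive male and female rows, so it may be worth verifying whether the variable is recorded per gender at all. The rows below are invented, and `Exampleland` is a hypothetical country name.

```r
# Hypothetical toy data (invented values); column names follow data_eu.
toy <- data.frame(
  Country         = "Exampleland",
  Year            = c(1992, 1992, 1993, 1993),
  Gender          = c("male", "female", "male", "female"),
  Cholesterol     = c(5.0, 5.3, 5.2, 5.5),
  Life_expectancy = c(65.2, 65.2, 66.0, 66.0)
)

# Mean cholesterol and life expectancy per gender.
by_gender <- aggregate(cbind(Cholesterol, Life_expectancy) ~ Gender,
                       data = toy, FUN = mean)
print(by_gender)

# Gap between the largest and smallest Life_expectancy within each
# country/year pair; a gap of 0 means both genders share one value.
gender_gap <- aggregate(Life_expectancy ~ Country + Year, data = toy,
                        FUN = function(x) max(x) - min(x))
names(gender_gap)[3] <- "Gap"
print(gender_gap)
```

If the gap is zero for every country and year in data_eu, the similar life expectancy of men and women would follow from how the data were assembled rather than from a health effect.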
## Country Year Gender BMI_Index Bloodpressure
## Albania : 50 1993 : 40 female:319 Min. :23.87 Min. :128.4
## Bulgaria: 50 1994 : 40 male :319 1st Qu.:25.04 1st Qu.:131.1
## Hungary : 50 1995 : 40 Median :25.44 Median :132.9
## Poland : 50 1996 : 40 Mean :25.47 Mean :133.1
## Romania : 50 1997 : 40 3rd Qu.:25.89 3rd Qu.:135.2
## Armenia : 26 1998 : 40 Max. :27.07 Max. :139.1
## (Other) :362 (Other):398
## Cholesterol Sugar Food Income
## Min. :4.502 Min. : 5.48 Min. :1570 Min. : 1466
## 1st Qu.:4.924 1st Qu.: 60.27 1st Qu.:2727 1st Qu.: 5124
## Median :5.129 Median : 84.93 Median :2926 Median : 9794
## Mean :5.082 Mean : 86.29 Mean :2921 Mean : 9826
## 3rd Qu.:5.273 3rd Qu.:111.64 3rd Qu.:3182 3rd Qu.:13705
## Max. :5.554 Max. :167.12 Max. :3755 Max. :25694
##
## Life_expectancy Region Period
## Min. :63.00 Eastern Europe :638 1980 - 1989:100
## 1st Qu.:68.83 Northern Europe: 0 1990 - 1999:338
## Median :70.80 Southern Europe: 0 2000 - 2004:200
## Mean :70.55 Western Europe : 0
## 3rd Qu.:72.80
## Max. :76.80
##
## Country Year Gender BMI_Index Bloodpressure
## Cyprus :50 1980 : 14 female:175 Min. :23.66 Min. :123.0
## Greece :50 1981 : 14 male :175 1st Qu.:25.01 1st Qu.:127.5
## Italy :50 1982 : 14 Median :25.57 Median :130.5
## Malta :50 1983 : 14 Mean :25.61 Mean :130.5
## Portugal:50 1984 : 14 3rd Qu.:26.09 3rd Qu.:133.1
## Spain :50 1985 : 14 Max. :28.01 Max. :138.1
## (Other) :50 (Other):266
## Cholesterol Sugar Food Income
## Min. :4.722 Min. : 65.75 Min. :2758 Min. : 7828
## 1st Qu.:5.249 1st Qu.: 76.71 1st Qu.:3217 1st Qu.:15854
## Median :5.358 Median : 84.93 Median :3405 Median :21442
## Mean :5.324 Mean : 93.78 Mean :3361 Mean :21549
## 3rd Qu.:5.462 3rd Qu.: 95.20 3rd Qu.:3541 3rd Qu.:26226
## Max. :6.127 Max. :153.43 Max. :3713 Max. :36962
##
## Life_expectancy Region Period
## Min. :62.70 Eastern Europe : 0 1980 - 1989:140
## 1st Qu.:74.50 Northern Europe: 0 1990 - 1999:140
## Median :76.80 Southern Europe:350 2000 - 2004: 70
## Mean :75.78 Western Europe : 0
## 3rd Qu.:78.30
## Max. :80.80
##
## Country Year Gender BMI_Index
## Austria : 50 1980 : 16 female:200 Min. :23.74
## Belgium : 50 1981 : 16 male :200 1st Qu.:24.67
## France : 50 1982 : 16 Median :25.13
## Germany : 50 1983 : 16 Mean :25.20
## Ireland : 50 1984 : 16 3rd Qu.:25.72
## Netherlands: 50 1985 : 16 Max. :27.34
## (Other) :100 (Other):304
## Bloodpressure Cholesterol Sugar Food
## Min. :120.9 Min. :5.303 Min. : 90.41 Min. :3094
## 1st Qu.:128.5 1st Qu.:5.552 1st Qu.:109.59 1st Qu.:3316
## Median :132.0 Median :5.706 Median :117.81 Median :3433
## Mean :131.7 Mean :5.700 Mean :120.10 Mean :3439
## 3rd Qu.:135.3 3rd Qu.:5.831 3rd Qu.:126.03 3rd Qu.:3569
## Max. :140.0 Max. :6.241 Max. :164.38 Max. :3817
##
## Income Life_expectancy Region Period
## Min. :16078 Min. :72.40 Eastern Europe : 0 1980 - 1989:160
## 1st Qu.:26758 1st Qu.:75.30 Northern Europe: 0 1990 - 1999:160
## Median :31854 Median :76.65 Southern Europe: 0 2000 - 2004: 80
## Mean :32403 Mean :76.57 Western Europe :400
## 3rd Qu.:37425 3rd Qu.:78.00
## Max. :49882 Max. :81.00
##
## Country Year Gender BMI_Index Bloodpressure
## Denmark:50 1980 : 10 female:125 Min. :23.38 Min. :120.0
## Finland:50 1981 : 10 male :125 1st Qu.:24.68 1st Qu.:128.7
## Iceland:50 1982 : 10 Median :25.05 Median :132.4
## Norway :50 1983 : 10 Mean :25.08 Mean :131.8
## Sweden :50 1984 : 10 3rd Qu.:25.49 3rd Qu.:135.3
## Albania: 0 1985 : 10 Max. :26.73 Max. :143.1
## (Other): 0 (Other):190
## Cholesterol Sugar Food Income
## Min. :5.098 Min. : 90.41 Min. :2901 Min. :21965
## 1st Qu.:5.575 1st Qu.:112.33 1st Qu.:3089 1st Qu.:27789
## Median :5.752 Median :120.55 Median :3154 Median :32297
## Mean :5.737 Mean :126.09 Mean :3169 Mean :34303
## 3rd Qu.:5.900 3rd Qu.:139.73 3rd Qu.:3250 3rd Qu.:37941
## Max. :6.192 Max. :164.38 Max. :3458 Max. :62370
##
## Life_expectancy Region Period
## Min. :73.70 Eastern Europe : 0 1980 - 1989:100
## 1st Qu.:75.80 Northern Europe:250 1990 - 1999:100
## Median :77.20 Southern Europe: 0 2000 - 2004: 50
## Mean :77.17 Western Europe : 0
## 3rd Qu.:78.50
## Max. :81.10
##
Median life expectancy is lowest in Eastern Europe, followed by Western, Southern and Northern Europe. Average income in Eastern Europe is also the lowest, as are the cholesterol values.
In Northern and Western Europe income is very high, as are the cholesterol values, yet people still live longer.
In Southern Europe income is lower than in Western and Northern Europe, but median life expectancy is higher than in Western Europe. It seems that not only income but also the country in which people live has an impact on life expectancy.
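The regional orderings can be checked directly from the medians printed in the four regional `summary()` outputs above. The sketch below copies those medians into a small data frame (the name `region_medians` is my own) and sorts the regions by each variable.

```r
# Medians copied from the regional summary() outputs in this report.
region_medians <- data.frame(
  Region = c("Eastern Europe", "Southern Europe",
             "Western Europe", "Northern Europe"),
  Life_expectancy = c(70.80, 76.80, 76.65, 77.20),
  Income          = c(9794, 21442, 31854, 32297),
  Cholesterol     = c(5.129, 5.358, 5.706, 5.752)
)

# Regions ordered from lowest to highest median of each variable.
by_life        <- region_medians$Region[order(region_medians$Life_expectancy)]
by_income      <- region_medians$Region[order(region_medians$Income)]
by_cholesterol <- region_medians$Region[order(region_medians$Cholesterol)]

print(by_life)
print(by_income)
print(by_cholesterol)
```

Income and cholesterol give the same ordering (Eastern, Southern, Western, Northern), while life expectancy swaps Southern and Western Europe.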
In the boxplot below more detailed information is given.
The plot confirms that life expectancy in Eastern Europe is the lowest in Europe. The yellow dot in each boxplot denotes the mean life expectancy.
##
## Call:
## lm(formula = data_eu$Life_expectancy ~ data_eu$Income)
##
## Residuals:
## Min 1Q Median 3Q Max
## -8.3006 -1.7113 0.2188 1.8447 5.6592
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 6.878e+01 1.292e-01 532.42 <2e-16 ***
## data_eu$Income 2.486e-04 5.187e-06 47.92 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.611 on 1636 degrees of freedom
## Multiple R-squared: 0.584, Adjusted R-squared: 0.5837
## F-statistic: 2296 on 1 and 1636 DF, p-value: < 2.2e-16
##
## Call:
## lm(formula = data_eu$Life_expectancy ~ data_eu$Income + data_eu$Cholesterol)
##
## Residuals:
## Min 1Q Median 3Q Max
## -8.4357 -1.7355 0.1785 1.8662 5.7525
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 6.546e+01 1.158e+00 56.517 < 2e-16 ***
## data_eu$Income 2.363e-04 6.699e-06 35.274 < 2e-16 ***
## data_eu$Cholesterol 6.666e-01 2.308e-01 2.888 0.00393 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.605 on 1635 degrees of freedom
## Multiple R-squared: 0.5861, Adjusted R-squared: 0.5856
## F-statistic: 1157 on 2 and 1635 DF, p-value: < 2.2e-16
##
## Call:
## lm(formula = data_eu$Life_expectancy ~ data_eu$Income + data_eu$Cholesterol +
## data_eu$Country)
##
## Residuals:
## Min 1Q Median 3Q Max
## -5.6002 -0.4134 0.0395 0.4255 4.7727
##
## Coefficients:
## Estimate Std. Error t value
## (Intercept) 9.016e+01 1.139e+00 79.174
## data_eu$Income 1.862e-04 7.958e-06 23.398
## data_eu$Cholesterol -3.603e+00 2.232e-01 -16.143
## data_eu$CountryArmenia -2.775e+00 2.248e-01 -12.346
## data_eu$CountryAustria 1.444e-01 3.893e-01 0.371
## data_eu$CountryAzerbaijan -6.993e+00 2.270e-01 -30.809
## data_eu$CountryBelarus -4.417e+00 2.313e-01 -19.093
## data_eu$CountryBelgium 8.980e-01 4.113e-01 2.183
## data_eu$CountryBosnia and Herzegovina -1.141e+00 2.267e-01 -5.034
## data_eu$CountryBulgaria -1.718e+00 2.065e-01 -8.322
## data_eu$CountryCroatia -9.280e-02 2.482e-01 -0.374
## data_eu$CountryCyprus 2.855e+00 2.977e-01 9.590
## data_eu$CountryDenmark -4.196e-01 4.262e-01 -0.984
## data_eu$CountryEstonia -3.960e+00 2.648e-01 -14.954
## data_eu$CountryFinland 1.226e+00 3.833e-01 3.198
## data_eu$CountryFrance 2.225e+00 3.900e-01 5.706
## data_eu$CountryGeorgia -2.033e+00 2.264e-01 -8.984
## data_eu$CountryGermany 6.726e-01 4.244e-01 1.585
## data_eu$CountryGreece 1.758e+00 2.648e-01 6.641
## data_eu$CountryHungary -4.186e+00 2.388e-01 -17.527
## data_eu$CountryIceland 4.373e+00 4.258e-01 10.270
## data_eu$CountryIreland 3.313e-01 3.593e-01 0.922
## data_eu$CountryItaly 1.011e+00 3.403e-01 2.972
## data_eu$CountryKazakhstan -1.030e+01 2.253e-01 -45.708
## data_eu$CountryLatvia -4.202e+00 2.468e-01 -17.026
## data_eu$CountryMacedonia, FYR -1.751e-01 2.229e-01 -0.785
## data_eu$CountryMalta 4.249e+00 3.064e-01 13.866
## data_eu$CountryMoldova -4.871e+00 2.242e-01 -21.728
## data_eu$CountryNetherlands 1.167e+00 4.019e-01 2.903
## data_eu$CountryNorway -1.185e+00 5.081e-01 -2.332
## data_eu$CountryPoland -1.252e+00 2.216e-01 -5.651
## data_eu$CountryPortugal 3.815e-03 2.730e-01 0.014
## data_eu$CountryRomania -3.784e+00 2.054e-01 -18.424
## data_eu$CountryRussia -8.398e+00 2.477e-01 -33.903
## data_eu$CountrySlovak Republic -1.055e+00 2.618e-01 -4.029
## data_eu$CountrySlovenia 4.123e-01 2.942e-01 1.402
## data_eu$CountrySpain 2.400e+00 3.057e-01 7.849
## data_eu$CountrySweden 2.464e+00 3.893e-01 6.327
## data_eu$CountrySwitzerland 7.660e-02 4.869e-01 0.157
## data_eu$CountryTurkey -6.762e+00 1.862e-01 -36.312
## data_eu$CountryUkraine -5.620e+00 2.221e-01 -25.309
## data_eu$CountryUnited Kingdom 1.841e+00 3.898e-01 4.723
## Pr(>|t|)
## (Intercept) < 2e-16 ***
## data_eu$Income < 2e-16 ***
## data_eu$Cholesterol < 2e-16 ***
## data_eu$CountryArmenia < 2e-16 ***
## data_eu$CountryAustria 0.71071
## data_eu$CountryAzerbaijan < 2e-16 ***
## data_eu$CountryBelarus < 2e-16 ***
## data_eu$CountryBelgium 0.02916 *
## data_eu$CountryBosnia and Herzegovina 5.36e-07 ***
## data_eu$CountryBulgaria < 2e-16 ***
## data_eu$CountryCroatia 0.70849
## data_eu$CountryCyprus < 2e-16 ***
## data_eu$CountryDenmark 0.32504
## data_eu$CountryEstonia < 2e-16 ***
## data_eu$CountryFinland 0.00141 **
## data_eu$CountryFrance 1.38e-08 ***
## data_eu$CountryGeorgia < 2e-16 ***
## data_eu$CountryGermany 0.11319
## data_eu$CountryGreece 4.27e-11 ***
## data_eu$CountryHungary < 2e-16 ***
## data_eu$CountryIceland < 2e-16 ***
## data_eu$CountryIreland 0.35667
## data_eu$CountryItaly 0.00300 **
## data_eu$CountryKazakhstan < 2e-16 ***
## data_eu$CountryLatvia < 2e-16 ***
## data_eu$CountryMacedonia, FYR 0.43228
## data_eu$CountryMalta < 2e-16 ***
## data_eu$CountryMoldova < 2e-16 ***
## data_eu$CountryNetherlands 0.00375 **
## data_eu$CountryNorway 0.01981 *
## data_eu$CountryPoland 1.89e-08 ***
## data_eu$CountryPortugal 0.98885
## data_eu$CountryRomania < 2e-16 ***
## data_eu$CountryRussia < 2e-16 ***
## data_eu$CountrySlovak Republic 5.86e-05 ***
## data_eu$CountrySlovenia 0.16123
## data_eu$CountrySpain 7.61e-15 ***
## data_eu$CountrySweden 3.23e-10 ***
## data_eu$CountrySwitzerland 0.87501
## data_eu$CountryTurkey < 2e-16 ***
## data_eu$CountryUkraine < 2e-16 ***
## data_eu$CountryUnited Kingdom 2.52e-06 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.9121 on 1596 degrees of freedom
## Multiple R-squared: 0.9505, Adjusted R-squared: 0.9492
## F-statistic: 746.8 on 41 and 1596 DF, p-value: < 2.2e-16
The relationship between life expectancy and income is strong: higher income is associated with higher life expectancy. The country in which people live is also an important factor. People in countries with a sophisticated health system live longer even when their cholesterol values are above normal and their income is not especially high.
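To make the size of the income effect concrete, the coefficients of the first model reported above (intercept 6.878e+01, slope 2.486e-04) can be plugged into the fitted line. This is a worked illustration only: the three income values are chosen arbitrarily, and the helper name `predict_life` is hypothetical.

```r
# Coefficients copied from the first lm() summary in this report.
intercept <- 6.878e+01
slope     <- 2.486e-04

# Hypothetical helper: fitted life expectancy for a given income.
predict_life <- function(income) intercept + slope * income

# Arbitrary example incomes (same unit as the Income variable).
incomes   <- c(5000, 20000, 40000)
predicted <- predict_life(incomes)

print(data.frame(Income = incomes, Predicted_life_expectancy = predicted))

# Change in fitted life expectancy per 10000 units of income.
per_10000 <- slope * 10000
print(per_10000)
```

Under this simple model, an additional 10000 units of income corresponds to roughly 2.5 additional years of life expectancy, regardless of the starting income. The model is linear, so it cannot capture the flattening at high incomes described elsewhere in the report.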
Women have almost the same life expectancy as men even though their cholesterol values are higher.
Life expectancy in Eastern European countries is lower than in the other European regions. The highest life expectancy is in Northern Europe, followed by Southern Europe and Western Europe.
The results for Southern Europe are interesting: even though people’s income is lower than in Western and Northern European countries, life expectancy is still high.
Country is a variable which plays an important role in life expectancy.
Yes, I did. I created a linear model of Life_expectancy on Income and then added the variables Cholesterol and Country. The full model gave a very high R^2 value of 0.95.
Adding Cholesterol to the model improved the R^2 value by only 0.002 (R^2 = 0.586). Adding Country increased the R^2 to 0.95.
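The step-by-step comparison of R^2 values can be sketched as a sequence of nested models. The data below are invented (three hypothetical countries A, B and C), so the numbers will not match the report. The sketch passes `data =` to `lm()` instead of using `data_eu$...` inside the formula, which is a deliberate deviation from the calls shown above: it gives shorter coefficient names and lets `update()` and `predict()` work as expected.

```r
# Hypothetical toy data (invented values); column names follow data_eu.
toy <- data.frame(
  Country = factor(rep(c("A", "B", "C"), each = 4)),
  Income  = c(4000, 5000, 6000, 7000,
              20000, 22000, 24000, 26000,
              30000, 33000, 36000, 39000),
  Cholesterol = c(4.9, 5.0, 5.1, 5.0,
                  5.3, 5.4, 5.3, 5.5,
                  5.7, 5.8, 5.7, 5.9),
  Life_expectancy = c(66.0, 66.4, 66.9, 67.1,
                      76.5, 76.6, 77.2, 77.3,
                      75.8, 76.1, 76.9, 76.8)
)

# Nested models: each one adds a predictor to the previous model.
m1 <- lm(Life_expectancy ~ Income, data = toy)
m2 <- update(m1, . ~ . + Cholesterol)
m3 <- update(m2, . ~ . + Country)

# R^2 of each model and the gain from each added predictor.
r2 <- sapply(list(income = m1, cholesterol = m2, country = m3),
             function(m) summary(m)$r.squared)
gain <- diff(r2)

print(round(r2, 4))
print(round(gain, 4))
```

R^2 can never decrease when a predictor is added to a nested least-squares model, so the adjusted R^2 (also reported by `summary()`) is the fairer figure when a factor with as many levels as Country is added.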
The result above can be interpreted as:
Exploring life expectancy is the subject of this project, and understanding its distribution is fundamental for this exploration. The distribution of life expectancy in Europe between 1980 and 2004 is negatively skewed: the mean is 74.15 and the median is 75. A life expectancy of 76 years has the highest frequency.
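A mean below the median, as reported here (74.15 versus 75), is a common rule-of-thumb sign of negative skew, although it is not a guarantee. The sketch below computes a moment-based skewness coefficient in base R on an invented vector; the function name `skewness` is defined here and does not come from a package.

```r
# Hypothetical life expectancy values with a long left tail (invented).
life <- c(63, 70, 74, 75, 76, 76, 77, 78)

# Moment-based skewness: third central moment divided by the
# second central moment raised to the power 1.5.
skewness <- function(x) {
  m <- mean(x)
  mean((x - m)^3) / mean((x - m)^2)^1.5
}

print(c(mean = mean(life), median = median(life)))
print(skewness(life))  # negative for a left-skewed sample
```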
Of all features, Income has the highest correlation with life expectancy. To visualize the relationship I use a scatterplot. The smoothed curve rises steeply between 10000 and 23750 $ PPP. This could be interpreted to mean that higher income prolongs life mainly up to a life expectancy of about 76 years. Beyond that, life expectancy grows much more slowly than income.
As shown before, life expectancy depends not only on income but also on the country in which people live. Country has the strongest influence on the prediction, which is why it is shown in the final diagram. Low-income countries have lower life expectancy. The Eastern European countries and Turkey have the lowest life expectancy in Europe. Countries in Southern Europe have a relatively high life expectancy compared to their income.
To examine life expectancy in Europe I created the data set data_eu, which is a collection of data sets taken from Gapminder. I explored Life_expectancy across the main features Cholesterol, Bloodpressure and Income, and also across other features such as Sugar, Food, Country, Region, Period and Gender.
Using scatterplots I found interesting relations between life expectancy, income and cholesterol in different European regions. Median Income and Cholesterol were lowest in Eastern Europe, followed by Southern, Western and Northern Europe, which had the highest values. For median Life_expectancy the order is different: Eastern Europe had the lowest value, followed by Western, Southern and Northern Europe. Southern Europe has a higher Life_expectancy than its median Income and Cholesterol values would suggest. The Mediterranean diet might be a reason for that.
I was surprised that systolic blood pressure had no impact on life expectancy, contrary to what I had assumed. Perhaps medical treatment of high systolic blood pressure is a reason for that.
I created a linear model based on Income, Cholesterol and Country which gave a very high R^2. Surprisingly, the variable Cholesterol had only a weak impact on the model, improving R^2 by only about two tenths of a percentage point.
In some cases I had to round the values of Cholesterol, Sugar, Food and Bloodpressure in order to see a clear pattern.
The data set contains data for forty countries from 1980 until 2004. Most Eastern European countries provide data only after 1992. It would be interesting to have newer data to see more recent developments.
I believe that life expectancy depends on more than income, cholesterol and country. The feature Country in particular deserves a closer look, since it acts as a combined value for many factors, for example mortality, environmental disasters, epidemic diseases, total health spending by country, and gender. These features should be taken into account in a deeper investigation.